 Łódź


SMCLM: Semantically Meaningful Causal Language Modeling for Autoregressive Paraphrase Generation

arXiv.org Artificial Intelligence

This article introduces semantically meaningful causal language modeling (SMCLM), a self-supervised method of training autoregressive models to generate semantically equivalent text. Our approach involves using a semantically meaningful text representation as an initial embedding in the autoregressive training and generation processes. An extensive empirical study demonstrates that the SMCLM approach makes autoregressive models capable of learning robust and high-quality paraphrase generation. The proposed method is competitive with the supervised method and achieves state-of-the-art results among unsupervised approaches. This article also presents a comprehensive set of automatic metrics that cover a wide range of autogenerated paraphrase evaluation aspects. Simultaneously, this article highlights the low reliability of the metrics that are widely used in paraphrase generation evaluation, including BLEU, ROUGE, and BERTScore.


Estimation methods of Matrix-valued AR model

arXiv.org Machine Learning

This article proposes novel estimation methods for the Matrix Autoregressive (MAR) model, specifically adaptations of the Yule-Walker equations and Burg's method, addressing limitations in existing techniques. The MAR model, by maintaining a matrix structure and requiring significantly fewer parameters than vector autoregressive (VAR) models, offers a parsimonious, yet effective, alternative for high-dimensional time series. Empirical results demonstrate that MAR models estimated via the proposed methods achieve a comparable fit to VAR models across metrics such as MAE and RMSE. These findings underscore the utility of Yule-Walker and Burg-type estimators in constructing efficient and interpretable models for complex temporal data.
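To make the parameter saving concrete, the sketch below counts autoregressive coefficients for m x n matrix observations. It assumes the bilinear form X_t = sum_k A_k X_{t-k} B_k^T + E_t for the MAR(p) model and a VAR(p) fitted to the vectorized series; the abstract does not state the model equation, so this form and the function names are illustrative assumptions. The counts ignore noise-covariance parameters and the scale normalization typically needed to identify A_k and B_k.

```python
def mar_param_count(m: int, n: int, p: int = 1) -> int:
    """Coefficients in an assumed bilinear MAR(p): each lag has A (m x m) and B (n x n)."""
    return p * (m * m + n * n)


def var_param_count(m: int, n: int, p: int = 1) -> int:
    """Coefficients in a VAR(p) on the vectorized series of length m * n."""
    d = m * n
    return p * d * d


if __name__ == "__main__":
    # Compare the two counts for a few illustrative matrix sizes.
    for m, n in [(3, 4), (10, 10), (20, 50)]:
        mar = mar_param_count(m, n)
        var = var_param_count(m, n)
        print(f"{m} x {n}: MAR(1) = {mar}, VAR(1) = {var}, ratio = {var / mar:.1f}")
```

Under these assumptions the gap grows quickly with dimension, which is the sense in which the MAR model is parsimonious for high-dimensional series.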


Model Discovery with Grammatical Evolution. An Experiment with Prime Numbers

arXiv.org Artificial Intelligence

Machine Learning produces efficient decision and prediction models based on input-output data only. Such models have the form of decision trees or neural nets and are far from transparent analytical models, based on mathematical formulas. Analytical model discovery requires additional knowledge and may be performed with Grammatical Evolution. Such models are transparent, concise, and have readable components and structure. This paper reports on a non-trivial experiment with generating such models.


PromptMap: An Alternative Interaction Style for AI-Based Image Generation

arXiv.org Artificial Intelligence

Recent technological advances popularized the use of image generation among the general public. Crafting effective prompts can, however, be difficult for novice users. To tackle this challenge, we developed PromptMap, a new interaction style for text-to-image AI that allows users to freely explore a vast collection of synthetic prompts through a map-like view with semantic zoom. PromptMap groups images visually by their semantic similarity, allowing users to discover relevant examples. We evaluated PromptMap in a between-subject online study ($n=60$) and a qualitative within-subject study ($n=12$). We found that PromptMap supported users in crafting prompts by providing them with examples. We also demonstrated the feasibility of using LLMs to create vast example collections. Our work contributes a new interaction style that supports users unfamiliar with prompting in achieving a satisfactory image output.


Ext2Gen: Alignment through Unified Extraction and Generation for Robust Retrieval-Augmented Generation

arXiv.org Artificial Intelligence

RAG has proven its effectiveness in reducing hallucinations in LLMs, when their knowledge is incomplete, outdated, or lacks sufficient detail to accurately address specific queries (Gao et al., 2023b; Fan et al., 2024). A critical aspect of RAG is the "retrieval" process, which involves identifying and selecting relevant text chunks. The quality of these retrieved chunks plays a pivotal role in the overall performance of RAG, as they form the basis for generating factual and contextually relevant answers aligned with the query intent (Asai et al., 2024; Wang et al., 2023; Zhang et al., 2024).

We go beyond accurate retrieval to emphasize robust generation that remains resilient to forgetting and distraction by the two challenges. Our key idea for enhancing robustness is an extract-then-generate approach, Ext2Gen, where the model first extracts query-relevant sentences from the retrieved chunks and then refines the information to generate a precise answer. The extraction step here serves as a chain-of-thought (CoT) process (Wei et al., 2022; Chu et al., 2023), where the model provides the evidence first before generating the final


Improving Hate Speech Classification with Cross-Taxonomy Dataset Integration

arXiv.org Artificial Intelligence

Algorithmic hate speech detection faces significant challenges due to the diverse definitions and datasets used in research and practice. Social media platforms, legal frameworks, and institutions each apply distinct yet overlapping definitions, complicating classification efforts. This study addresses these challenges by demonstrating that existing datasets and taxonomies can be integrated into a unified model, enhancing prediction performance and reducing reliance on multiple specialized classifiers. The work introduces a universal taxonomy and a hate speech classifier capable of detecting a wide range of definitions within a single framework. Our approach is validated by combining two widely used but differently annotated datasets, showing improved classification performance on an independent test set. This work highlights the potential of dataset and taxonomy integration in advancing hate speech detection, increasing efficiency, and ensuring broader applicability across contexts.


Foundation Models -- A Panacea for Artificial Intelligence in Pathology?

arXiv.org Artificial Intelligence

The role of artificial intelligence (AI) in pathology has evolved from aiding diagnostics to uncovering predictive morphological patterns in whole slide images (WSIs). Recently, foundation models (FMs) leveraging self-supervised pre-training have been widely advocated as a universal solution for diverse downstream tasks. However, open questions remain about their clinical applicability and generalization advantages over end-to-end learning using task-specific (TS) models. Here, we focused on AI with clinical-grade performance for prostate cancer diagnosis and Gleason grading. We present the largest validation of AI for this task, using over 100,000 core needle biopsies from 7,342 patients across 15 sites in 11 countries. We compared two FMs with a fully end-to-end TS model in a multiple instance learning framework. Our findings challenge assumptions that FMs universally outperform TS models. While FMs demonstrated utility in data-scarce scenarios, their performance converged with - and was in some cases surpassed by - TS models when sufficient labeled training data were available. Notably, extensive task-specific training markedly reduced clinically significant misgrading, misdiagnosis of challenging morphologies, and variability across different WSI scanners. Additionally, FMs used up to 35 times more energy than the TS model, raising concerns about their sustainability. Our results underscore that while FMs offer clear advantages for rapid prototyping and research, their role as a universal solution for clinically applicable medical AI remains uncertain. For high-stakes clinical applications, rigorous validation and consideration of task-specific training remain critically important. We advocate for integrating the strengths of FMs and end-to-end learning to achieve robust and resource-efficient AI pathology solutions fit for clinical use.


EXALT: EXplainable ALgorithmic Tools for Optimization Problems

arXiv.org Artificial Intelligence

Algorithmic solutions have significant potential to improve decision-making across various domains, from healthcare to e-commerce. However, the widespread adoption of these solutions is hindered by a critical challenge: the lack of human-interpretable explanations. Current approaches to Explainable AI (XAI) predominantly focus on complex machine learning models, often producing brittle and non-intuitive explanations. This project proposes a novel approach to developing explainable algorithms by starting with optimization problems, specifically the assignment problem. The developed software library enriches basic algorithms with human-understandable explanations through four key methodologies: generating meaningful alternative solutions, creating robust solutions through input perturbation, generating concise decision trees, and providing reports with a comprehensive explanation of the results. Currently developed tools are often designed with specific clustering algorithms in mind, which limits their adaptability and flexibility to incorporate alternative techniques. Additionally, many of these tools fail to integrate expert knowledge, which could enhance the clustering process by providing valuable insights and context. This lack of adaptability and integration can hinder the effectiveness and robustness of the clustering outcomes in various applications. The project represents a step towards making algorithmic solutions more transparent, trustworthy, and accessible. By collaborating with industry partners in sectors such as sales, we demonstrate the practical relevance and transformative potential of our approach.
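The first methodology named in the abstract, generating alternative solutions, can be illustrated on a tiny assignment problem. The sketch below is not the EXALT library's API, which the abstract does not describe; the function name, the cost matrix, and the brute-force enumeration are assumptions chosen for readability, and the approach is only practical for very small instances.

```python
from itertools import permutations


def ranked_assignments(cost, k=3):
    """Return the k cheapest assignments as (total_cost, assignment) pairs.

    cost[i][j] is the cost of giving task j to worker i; the matrix must be
    square. assignment[i] is the task given to worker i. Ties are broken by
    the lexicographic order of the assignment tuple.
    """
    n = len(cost)
    if any(len(row) != n for row in cost):
        raise ValueError("cost matrix must be square")
    scored = []
    # Enumerate every one-to-one assignment (n! candidates).
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        scored.append((total, perm))
    scored.sort()
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical 3 x 3 cost matrix, invented for illustration.
    example_cost = [
        [4, 1, 3],
        [2, 0, 5],
        [3, 2, 2],
    ]
    for total, assignment in ranked_assignments(example_cost, k=3):
        print(total, assignment)
```

Showing the runner-up assignments next to the optimum lets a user see how much cost is given up by each alternative, which is one simple way to make the chosen solution easier to justify.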


Foundation Model of Electronic Medical Records for Adaptive Risk Estimation

arXiv.org Artificial Intelligence

We developed the Enhanced Transformer for Health Outcome Simulation (ETHOS), an AI model that tokenizes patient health timelines (PHTs) from EHRs. ETHOS predicts future PHTs using transformer-based architectures. The Adaptive Risk Estimation System (ARES) employs ETHOS to compute dynamic and personalized risk probabilities for clinician-defined critical events. ARES incorporates a personalized explainability module that identifies key clinical factors influencing risk estimates for individual patients. ARES was evaluated on the MIMIC-IV v2.2 dataset in emergency department (ED) settings, benchmarking its performance against traditional early warning systems and machine learning models. We processed 299,721 unique patients from MIMIC-IV into 285,622 PHTs, with 60% including hospital admissions. The dataset contained over 357 million tokens. ETHOS outperformed benchmark models in predicting hospital admissions, ICU admissions, and prolonged hospital stays, achieving superior AUC scores. ETHOS-based risk estimates demonstrated robustness across demographic subgroups with strong model reliability, confirmed via calibration curves. The personalized explainability module provides insights into patient-specific factors contributing to risk. ARES, powered by ETHOS, advances predictive healthcare AI by providing dynamic, real-time, and personalized risk estimation with patient-specific explainability to enhance clinician trust. Its adaptability and superior accuracy position it as a transformative tool for clinical decision-making, potentially improving patient outcomes and resource allocation in emergency and inpatient settings. We release the full code at github.com/ipolharvard/ethos-ares to facilitate future research.


Massive Values in Self-Attention Modules are the Key to Contextual Knowledge Understanding

arXiv.org Artificial Intelligence

Large language models (LLMs) have achieved remarkable success in contextual knowledge understanding. In this paper, we show that concentrated massive values consistently emerge in specific regions of attention queries (Q) and keys (K), while no such pattern appears in values (V), across various modern transformer-based LLMs (Q, K, and V denote the representations output by the query, key, and value layers, respectively). Through extensive experiments, we further demonstrate that these massive values play a critical role in interpreting contextual knowledge (i.e., knowledge obtained from the current context window) rather than in retrieving parametric knowledge stored within the model's parameters. Our further investigation of quantization strategies reveals that ignoring these massive values leads to a pronounced drop in performance on tasks requiring rich contextual understanding, aligning with our analysis. Finally, we trace the emergence of concentrated massive values and find that such concentration is caused by Rotary Positional Encoding (RoPE) and is already present in the first layers. These findings shed new light on how Q and K operate in LLMs and offer practical insights for model design and optimization.
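For readers unfamiliar with the mechanism the abstract points to, the following is a minimal sketch of rotary positional encoding applied to a single query or key vector. It assumes the adjacent-pair convention (dimensions 2j and 2j+1 rotated together) with base 10000; real implementations differ in how they pair dimensions, and the function names here are illustrative. RoPE is applied to Q and K but not to V, which is consistent with where the abstract reports massive values.

```python
import math


def rope(x, pos, base=10000.0):
    """Rotate consecutive pairs of x by position-dependent angles (RoPE sketch)."""
    d = len(x)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, d, 2):
        # Pair j = i / 2 uses frequency base ** (-2j / d) = base ** (-i / d).
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out.append(x[i] * c - x[i + 1] * s)
        out.append(x[i] * s + x[i + 1] * c)
    return out


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(u * v for u, v in zip(a, b))


if __name__ == "__main__":
    q = [0.5, -1.0, 2.0, 0.25]
    k = [1.5, 0.3, -0.7, 1.0]
    # The attention score depends only on the relative offset (here 2),
    # so both lines should print the same value up to rounding.
    print(dot(rope(q, 3), rope(k, 5)))
    print(dot(rope(q, 10), rope(k, 12)))
```

Because each pair is rotated at its own frequency, different dimensions of Q and K carry positional information at different scales.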